# Dynamic Masked Attention
The Doge family of models uses dynamic masked attention for sequence transformation, with state transitions handled by either multi-layer perceptrons (MLPs) or a cross-domain mixture of experts (MoE). All entries below are large language models served through the Transformers library and released under Apache-2.0.

| Model | Author | License | Downloads | Likes | Languages | Description |
|-------|--------|---------|-----------|-------|-----------|-------------|
| Doge 20M Chinese | wubingheng | Apache-2.0 | 65 | 2 | Multiple languages | Sequence-transformation model using dynamic masked attention, with MLP or cross-domain MoE state transitions. |
| Doge 120M MoE Instruct | SmallDoge | Apache-2.0 | 240 | 1 | English | Instruction-tuned variant using dynamic masked attention, with MLP or cross-domain MoE state transitions. |
| Doge 320M Instruct | SmallDoge | Apache-2.0 | 12.61k | 3 | English | Lightweight model based on dynamic masked attention, trained with supervised fine-tuning (SFT) and direct preference optimization (DPO); suited to question answering and dialogue. |
| Doge 320M | SmallDoge | Apache-2.0 | 3,028 | 4 | Multiple languages | Base sequence-transformation model using dynamic masked attention, with MLP or cross-domain MoE state transitions. |
| Doge 160M Reason Distill | SmallDoge | Apache-2.0 | 26 | 4 | English | Reasoning-distilled lightweight model combining dynamic masked attention with a cross-domain MoE; focused on reasoning and question-answering tasks. |
| Doge 160M | SmallDoge | Apache-2.0 | 4,227 | 4 | Multiple languages | Small language model using dynamic masked attention, trained by the SmallDoge community; supports text generation. |
| Doge 20M Instruct | SmallDoge | Apache-2.0 | 5,010 | 4 | English | Small model based on dynamic masked attention; supports instruction following and Q&A. |
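The model cards above all describe the same core mechanism: attention whose mask is computed dynamically from the input rather than being fixed in advance. The sketch below is a toy illustration of that general idea, not the SmallDoge implementation; the gate projection and the function name are hypothetical, and the real architecture may compute its dynamic mask differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dynamic_masked_attention(q, k, v, gate):
    """Single-head attention with an input-dependent ("dynamic") mask.

    q, k, v: (batch, seq, dim); gate: nn.Linear(dim, 1), hypothetical.
    """
    b, t, d = q.shape
    scores = (q @ k.transpose(-2, -1)) / d ** 0.5             # (b, t, t) logits
    causal = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal, float("-inf"))        # fixed causal part
    # Dynamic part: a per-key gate computed from the value states is added to
    # the logits, softly masking out keys the gate scores low for this input.
    key_bias = F.logsigmoid(gate(v)).squeeze(-1)              # (b, t)
    scores = scores + key_bias.unsqueeze(1)                   # broadcast over queries
    return F.softmax(scores, dim=-1) @ v                      # (b, t, dim)

# Smoke test with random tensors.
b, t, d = 2, 8, 16
gate = nn.Linear(d, 1)
out = dynamic_masked_attention(torch.randn(b, t, d), torch.randn(b, t, d),
                               torch.randn(b, t, d), gate)
print(out.shape)  # torch.Size([2, 8, 16])
```

The appeal of this style of masking is that it stays soft and learned: uninformative positions can be suppressed per input without changing the standard quadratic attention shape.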
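Since every entry lists Transformers as its serving library, a checkpoint can be loaded in the usual way. A minimal sketch, assuming the models are published on the Hugging Face Hub under IDs matching the author and model names above (the repo ID here is an assumption):

```python
# Minimal usage sketch via Transformers. The Hub ID "SmallDoge/Doge-20M-Instruct"
# is assumed from the author/model names in the table; trust_remote_code=True is
# needed only if the architecture ships as custom code rather than a built-in
# Transformers class.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SmallDoge/Doge-20M-Instruct"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Instruct variants are chat-tuned, so format the prompt with the chat template.
messages = [{"role": "user", "content": "What is dynamic masked attention?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```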